Near-Optimal (Euclidean) Metric Compression

Authors

  • Piotr Indyk
  • Tal Wagner
Abstract

The metric sketching problem is defined as follows. Given a metric on $n$ points and $\epsilon > 0$, we wish to produce a small data structure (a sketch) that, given any pair of point indices, recovers the distance between the points up to a $1 + \epsilon$ distortion. In this paper we consider metrics induced by the $\ell_2$ and $\ell_1$ norms whose spread (the ratio of the diameter to the closest-pair distance) is bounded by $\Phi > 0$. A well-known dimensionality reduction theorem due to Johnson and Lindenstrauss yields a sketch of size $O(\epsilon^{-2} \log(\Phi n) \, n \log n)$, i.e., $O(\epsilon^{-2} \log(\Phi n) \log n)$ bits per point. We show that this bound is not optimal, and can be substantially improved to $O(\epsilon^{-2} \log(1/\epsilon) \cdot \log n + \log\log \Phi)$ bits per point. Furthermore, we show that our bound is tight up to a factor of $\log(1/\epsilon)$. We also consider sketching of general metrics and provide a sketch of size $O(n \log(1/\epsilon) + \log\log \Phi)$ bits per point, which we show is optimal.

arXiv:1609.06295v2 [cs.CG], 18 Nov 2016
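As a rough illustration of the Johnson-Lindenstrauss baseline mentioned in the abstract, the following minimal sketch projects points through a random Gaussian matrix and answers distance queries from the projected coordinates. It is not the paper's improved construction. The function names, the parameters `k` and `seed`, and the two bit-count helpers are assumptions made for this example. The helpers evaluate the two per-point bounds from the abstract with the constants hidden by the O-notation dropped, so their values indicate growth rates only. The quantization of coordinates, which a real sketch would need in order to be measured in bits, is omitted.

```python
import math
import random


def jl_sketch(points, k, seed=0):
    """Project each point to k dimensions using a random Gaussian matrix.

    Illustrative baseline only: coordinates stay as floats, whereas a real
    sketch would also round them to a bounded number of bits.
    """
    rng = random.Random(seed)
    d = len(points[0])
    # Entries are N(0, 1/k), so squared lengths are preserved in expectation.
    scale = 1.0 / math.sqrt(k)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]
    return [[sum(row[j] * p[j] for j in range(d)) for row in matrix] for p in points]


def estimate_distance(sketch, i, j):
    """Estimate the l2 distance between points i and j from the sketch alone."""
    return math.dist(sketch[i], sketch[j])


def baseline_bits_per_point(n, eps, spread):
    """eps^-2 * log(spread * n) * log n, with O-notation constants dropped."""
    return (1.0 / eps ** 2) * math.log2(spread * n) * math.log2(n)


def improved_bits_per_point(n, eps, spread):
    """eps^-2 * log(1/eps) * log n + log log spread, constants dropped."""
    return (1.0 / eps ** 2) * math.log2(1.0 / eps) * math.log2(n) + math.log2(
        math.log2(spread)
    )


if __name__ == "__main__":
    rng = random.Random(1)
    points = [[rng.uniform(-1.0, 1.0) for _ in range(50)] for _ in range(10)]
    sketch = jl_sketch(points, k=200)
    true_d = math.dist(points[0], points[1])
    print("true:", true_d, "estimated:", estimate_distance(sketch, 0, 1))
    print("baseline bound:", baseline_bits_per_point(1024, 0.5, 2 ** 16))
    print("improved bound:", improved_bits_per_point(1024, 0.5, 2 ** 16))
```

The spread enters the baseline bound as a multiplicative $\log(\Phi n)$ factor but enters the improved bound only as an additive $\log\log \Phi$ term, which is where the two helper functions diverge as `spread` grows.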

Similar resources

Assessment of the Log-Euclidean Metric Performance in Diffusion Tensor Image Segmentation

Introduction: Appropriate definition of the distance measure between diffusion tensors has a deep impact on Diffusion Tensor Image (DTI) segmentation results. The geodesic metric is the best distance measure since it yields high-quality segmentation results. However, the important problem with the geodesic metric is a high computational cost of the algorithms based on it. The main goal of this ...

An Efficient, Geometric Approach to Rigid Body Motion Interpolation

This paper develops a method for generating trajectories for a rigid body with specified boundary conditions at the end points and intermediate locations. It is well known that SE(3), the set of all rigid body positions and orientations, is a non-Euclidean space. In general, the problem of determining trajectories that are optimal with respect to some meaningful metric does not have an analytic...

An Effective Approach for Robust Metric Learning in the Presence of Label Noise

Many algorithms in machine learning, pattern recognition, and data mining are based on a similarity/distance measure. For example, the kNN classifier and clustering algorithms such as k-means require a similarity/distance function. Also, in Content-Based Information Retrieval (CBIR) systems, we need to rank the retrieved objects based on the similarity to the query. As generic measures such as ...

Tangent Bundle of the Hypersurfaces in a Euclidean Space

Let $M$ be an orientable hypersurface in the Euclidean space $R^{2n}$ with induced metric $g$ and $TM$ be its tangent bundle. It is known that the tangent bundle $TM$ has induced metric $\overline{g}$ as a submanifold of the Euclidean space $R^{4n}$, which is not a natural metric in the sense that the submersion $\pi : (TM,\overline{g}) \rightarrow (M,g)$ is not a Riemannian submersion. In this paper...

An Iterative Algorithm for Two-Dimensional Digital Least Metric Problems with Applications to Digital Image Compression

A correspondence between the problem of two-dimensional digital least-metric (DLM) fitting and data detection in serially concatenated systems in digital communication theory is described. Nearly optimal detection algorithms based on recent advances in iterative detection/decoding are applied to the DLM problem for two applications in digital image compression. The first application is least squares...

Publication date: 2017